1. INTRODUCTION

Because of its small size, short life cycle and compact 
genome, Arabidopsis thaliana (hereafter Arabidopsis) has 
been developed into a model system for plants. Since its nu- 
clear genome was completely sequenced 10 years ago [1], 
the quest to decipher the functions of all its genes has been a 
major strand in Arabidopsis research. Forward (from- 
phenotype-to-gene) and reverse (from-gene-to-phenotype) 
genetics have been the main workhorses en route to this 
goal. Even before the Arabidopsis Genome Initiative got 
underway, numerous forward genetic screens had identified 
mutants with abnormal phenotypes that resulted in the identi- 
fication and functional characterization of many genes for 
specific plant functions. Identification of the mutated genes 
is usually the most labour-intensive step in classical forward 
genetics. With increasing knowledge of the sequences of 
Arabidopsis genes and their transcripts, reverse genetic 
screens have become possible, which complement the classi- 
cal approach by permitting direct analysis of mutants for 
specific genes, selected for their potential impact on a bio- 
logical function of interest. According to the Annual Report 
2010 of The Multinational Coordinated Arabidopsis thaliana 
Functional Genomics Project (http://www.arabidopsis.org/ 
portals/masc/masc_docs/masc_reports.jsp), the experimentally
verified functions of just over 9000 of the ~27,000
Arabidopsis genes had been annotated as of March 2010.

In silico approaches can assess the functions of Arabi- 
dopsis genes whose homologs have been characterized in 
other species, or which code for proteins with conserved
domains, and gene functions can be predicted by correlating 
data derived from functional genomics analyses, such as 
large-scale mRNA, protein or metabolite profiling [2-4]. 
However, to decipher the biological function(s) of every 
gene in Arabidopsis will require much further effort, and 
many projects have been initiated to create the resources 
required for this formidable task [5]. Technological advances 
in recent years have led to the assembly of large collections 
of Arabidopsis mutants bearing molecularly mapped inser- 
tions. Besides total inactivation of the gene function, some of 
the newly developed mutagenesis techniques allow one to 
alter the dosage or function of gene products, making refined 
analyses of gene-function relationships possible. Several 
databases are now available that permit researchers to scan 
the Arabidopsis genome for mutations in genes of interest 
and obtain the corresponding mutant lines from public stock 
centres. 

To elucidate the function of the entire complement of 
Arabidopsis genes requires ways to systematically assess a 
wide range of phenotypes, which can be applied in a high- 
throughput manner to large numbers of mutants. Saturated 
collections of loss-of-function and gain-of-function lines will 
also be very valuable. A further challenge arises from the 
recognition that the vast majority of gene knock-outs in 
Arabidopsis do not give rise to obvious phenotypes, and 
might require functional characterisation by -omics-type 
analyses or the simultaneous inactivation/down-regulation of 
more than one gene. In this review, we summarize the state 
of the art in systematic gene-function analyses and highlight 
tools and trends in the application of systematic forward and 
reverse genetics to Arabidopsis. 
2. TOOLS FOR THE ANALYSIS OF GENE FUNCTIONS

Several methods have been developed that enable one to 
change the amount or the nature of gene products either by 
altering the original gene through the introduction of muta- 
tions or by introducing transgenes. Since the Arabidopsis 
genome sequence was published, most efforts have focused
on generating loss-of-function lines for reverse genetics. In 
addition, methods for overexpressing genes have gained in 
importance more recently [6, 7]. 

2.1. Loss-of-Function Mutants (Knock-Out Approach) 

The most straightforward approach to investigating the 
function of a gene is to characterize the phenotypic changes 
associated with its total inactivation [8]. Loss of gene func- 
tion can be achieved by introducing point mutations or short 
insertions/deletions, typically by means of chemical or 
physical mutagenesis, or by disrupting genes by inserting 
larger DNA sequences, such as T-DNAs or transposable 
elements. 

2.1.1. Point Mutation and Short Insertions/Deletions 

Point mutations can be introduced with high frequency 
by mutagenesis with chemicals such as ethyl methanesulfonate
(EMS), but only a small fraction of such mutations
actually lead to total loss of gene function, e.g. when they 
generate a premature stop codon or alter functionally critical 
amino acids [9]. Similarly, short insertions/deletions usually 
eliminate gene function only if the coding sequence is af- 
fected and a frameshift results. Point mutations and short 
insertions/deletions have the disadvantage that the location 
of the mutation is random and cannot be predetermined. 
Therefore, in order to have a reasonable probability of find- 
ing loss-of-function mutations for a given gene, large num- 
bers of mutants must be generated. To remove the unwanted 
background mutations, several rounds of backcrossing to the 
wild type are usually required. 
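
As a rough illustration of why only a small fraction of point mutations are nulls, the sketch below lists the positions in a coding sequence at which a single EMS-type change would create a premature stop codon. It assumes that EMS induces mainly G-to-A and C-to-T transitions on the coding strand, which is not stated in the text above. The function name `ems_stop_sites` and the example sequence are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}

# Assumption: EMS mainly causes G/C-to-A/T transitions, seen on the
# coding strand as either G->A or C->T.
EMS_TRANSITIONS = {"G": "A", "C": "T"}


def ems_stop_sites(cds):
    """Return (position, codon, mutant_codon) for every single EMS-type
    change that turns a sense codon into a stop codon.

    Positions are 0-based offsets into the coding sequence.
    """
    cds = cds.upper()
    hits = []
    usable_length = len(cds) - len(cds) % 3  # ignore a trailing partial codon
    for start in range(0, usable_length, 3):
        codon = cds[start:start + 3]
        if codon in STOP_CODONS:
            continue  # already a stop codon
        for offset, base in enumerate(codon):
            new_base = EMS_TRANSITIONS.get(base)
            if new_base is None:
                continue  # A and T are not changed in this simple model
            mutant = codon[:offset] + new_base + codon[offset + 1:]
            if mutant in STOP_CODONS:
                hits.append((start + offset, codon, mutant))
    return hits


if __name__ == "__main__":
    toy_cds = "ATGCAATGGCGATAA"  # hypothetical 5-codon sequence
    for position, codon, mutant in ems_stop_sites(toy_cds):
        print(position, codon, "->", mutant)
```

Under this model only four codons (CAA, CAG, CGA and TGG) can be converted into a stop codon by a single change, which is consistent with the low proportion of knock-out alleles among EMS-induced mutations.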

The identification of individual, randomly induced point 
mutations or short insertions/deletions within the genome of 
a mutant is a laborious process involving map-based cloning 
[10], although recent advances in high-throughput sequencing
technology [10, 11] might make resequencing of the entire
mutant genome the method of choice in the future. Because
Arabidopsis reference genome sequences (such as that of the
Col-0 ecotype) are already available, next-generation sequencing
of mutant genomes will be time- and cost-effective. Another
approach to detect small changes within the genome is an 
adaptation of the microarray technology, which detects sin- 
gle nucleotide polymorphisms (SNPs) through hybridizing 
genomic DNA to oligonucleotides representing the entire 
genome spotted on chips [12]. With continuous advances in 
sequencing technologies, genome-based methods can replace 
the conventional marker-based genotyping approach. Nevertheless,
because EMS-mutated plants contain many additional point
mutations, it may be advisable first to narrow down the
mutation to a chromosome arm with map-based approaches and
conventional techniques.

For reverse genetic approaches employing EMS-induced 
mutations, the "Targeting Induced Local Lesions IN Ge- 
nomes" (TILLING) method has been developed [13]. Here, 
the endonuclease CEL I identifies mismatches in heteroduplexes
between wild-type and mutant DNA sequences after
PCR-based amplification of genome sequences of interest
[14]. For systematic reverse genetics screens, each target
coding sequence must be covered by PCR amplicons [15].
By September 2010, 9550 TILLING lines in which the muta- 
tion has been assigned to a specific gene had been donated to 
public stock centres (Table 1). 
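
The mismatch-detection step can be pictured with a minimal sketch. It assumes equal-length amplicons that differ only by substitutions, and it assumes that cleavage occurs at the mismatch itself, so that the two fragment lengths add up to the amplicon length. The function name `cel1_fragments` and the sequences are hypothetical.

```python
def cel1_fragments(wild_type, mutant):
    """Predict fragment lengths for each mismatch in a heteroduplex.

    Returns a list of (mismatch_position, left_length, right_length),
    with 0-based positions. Simplification: the cut is placed exactly
    at the mismatch.
    """
    if len(wild_type) != len(mutant):
        raise ValueError("amplicons must have equal length (substitutions only)")
    amplicon_length = len(wild_type)
    fragments = []
    for position, (wt_base, mut_base) in enumerate(zip(wild_type.upper(), mutant.upper())):
        if wt_base != mut_base:
            fragments.append((position, position, amplicon_length - position))
    return fragments


if __name__ == "__main__":
    wt = "ACGTACGTAC"   # hypothetical wild-type amplicon
    mut = "ACGTATGTAC"  # same amplicon with one C-to-T change
    print(cel1_fragments(wt, mut))
```

Because the two fragment lengths are complementary, the position of a mutation within an amplicon can be estimated from the fragment sizes alone before the amplicon is sequenced.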

Most point mutations do not cause complete gene knock- 
out, but rather a decrease in gene expression (see also Sec- 
tion 2.2) or a change in the function of the gene product due 
to alterations in its sequence. Amino acid exchanges can lead 
to hypomorphic alleles displaying a range of phenotypes, 
allowing the analysis of important domains of the protein. 
Such variants can be pinpointed by searching the TILLING 
database (Table 2). The recent release of the genome se- 
quences of 80 different Arabidopsis ecotypes (with more to 
come when the 1001 Genomes project is completed in 2011) 
makes this an important tool for understanding natural varia- 
tion of sequences and correlating them with evolutionary 
processes [16, 17] (Table 2). 

2.1.2. Insertional Mutagenesis 

Insertional mutagenesis has two major advantages: the 
mutations are labelled by the inserted fragment of known 
sequence ('tag') and insertions within the coding region have 
a high probability of eliminating the gene function. In Arabidopsis,
T-DNA or transposons - mainly the Activator
(Ac)/Dissociation (Ds) system - are usually employed as
insertional mutagens [18]. In an international effort involving
many laboratories using both systems, a large number of
insertion mutants have been generated (Table 1), which now
provide nearly genome-wide coverage - about 96% of all 
Arabidopsis genes according to the Annual Report 2010 of 
The Multinational Coordinated Arabidopsis thaliana Func- 
tional Genomics Project (http://www.arabidopsis.org/portals/ 
masc/masc_docs/masc_reports.jsp). The regions flanking the 
insertions (flanking sequence tags, FST) in over 325,000 
lines have been sequenced by several laboratories and 
mapped to the Arabidopsis reference genome to identify the 
precise insertion sites. These lines have been indexed, depos- 
ited in public stock centres and are accessible for Internet- 
based searches (Table 2). 
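
Assigning a mapped flanking sequence tag to a gene amounts to an interval lookup, which the following sketch illustrates. It assumes non-overlapping gene models on a single chromosome with 1-based inclusive coordinates; the gene identifiers, coordinates and function names are hypothetical and do not reflect how any of the listed databases are implemented.

```python
from bisect import bisect_right


def build_index(genes):
    """Sort (gene_id, start, end) tuples by start and keep the start list."""
    ordered = sorted(genes, key=lambda gene: gene[1])
    starts = [gene[1] for gene in ordered]
    return ordered, starts


def gene_at(index, position):
    """Return the id of the gene containing `position`, or None."""
    ordered, starts = index
    i = bisect_right(starts, position) - 1  # last gene starting at or before position
    if i >= 0:
        gene_id, start, end = ordered[i]
        if start <= position <= end:
            return gene_id
    return None


if __name__ == "__main__":
    # Hypothetical gene models on one chromosome
    index = build_index([("geneA", 100, 500), ("geneB", 900, 1500)])
    for insertion_site in (300, 700, 900):
        print(insertion_site, gene_at(index, insertion_site))
```

An insertion that falls between two gene models returns `None` here; in practice such lines may still be informative if the insertion lies in a promoter or other regulatory region.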

Because stock centres distribute T-DNA or transposon 
lines usually as populations that segregate for the mutation, a 
genotyping step is necessary to identify homozygous plants 
for the analysis of phenotypes. Recently, populations of ho- 
mozygous SALK lines were generated, which can be used 
directly for phenotypic screens [19]. To date (September 
2010) these homozygous lines represent 18,318 genes, with 
about 9000 genes covered by two alleles. However, a certain 
proportion of hemizygous lines escaped systematic genotyp- 
ing, so that a certain fraction - perhaps as large as 20% - 
might still not be homozygous (our unpublished results). 
Moreover, because T-DNA lines often contain more than
one insertion, the unambiguous assignment of phenotypes
to mutations requires either that at least two insertion
alleles of each gene be analysed or that the mutation be
complemented with the wild-type gene.
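
The effort involved in the genotyping step can be estimated with a short calculation. The sketch assumes that the distributed seed batch is the selfed progeny of a hemizygous plant, that the insertion segregates 1:2:1 and that it has no effect on viability, so that one plant in four is homozygous. These assumptions are illustrative and are not taken from the stock centres. The function name `plants_to_genotype` is hypothetical.

```python
import math


def plants_to_genotype(p_homozygous=0.25, confidence=0.95):
    """Smallest number of plants n for which the probability of finding
    at least one homozygous plant, 1 - (1 - p)^n, reaches `confidence`."""
    return math.ceil(math.log(1 - confidence) / math.log(1 - p_homozygous))


if __name__ == "__main__":
    for confidence in (0.95, 0.99):
        print(confidence, plants_to_genotype(confidence=confidence))
```

Under these assumptions, about a dozen plants per line need to be genotyped to be reasonably sure of recovering a homozygote, which illustrates why pre-genotyped homozygous collections save considerable effort in large screens.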

Table 1. Resources of Lines for Gene Function Analyses in Arabidopsis

| Collection | Type | Genetic background | Number of lines | Number of insertions mapped to genome | Reference |
|---|---|---|---|---|---|
| Loss-of-function or knock-down | | | | | |
| SALK lines (SALK Institute, USA) | T-DNA | Col-0 | 137,259, of which 33,507 are homozygous (in progress) | 166,127 (in progress) | [119] |
| SAIL lines (Syngenta, USA) | T-DNA | Col-0, Col-3 | 54,612, of which 3494 are homozygous | 62,041 | [120] |
| Wisconsin lines | Ds-Lox | Col-0 | 32,614 | 21,406 | [121] |
| GABI-KAT lines (GABI, Germany) | T-DNA | Col-0 | 117,482 | 69,877 (in progress) | [122] |
| JIC SM (SLAT collection) | En/Spm | Col | ~48,000; 24,000 single seed lines | 26,007 | [35] |
| CSHL lines (Martienssen Transposon Insertion lines) (either gene or enhancer trap) | Ds-GUS | Ler | 22,084 | 27,398 | [62, 123, 124] |
| RIKEN transposon-tagged lines | Ds | No-0 | 15,267 lines (in progress) | 15,267 lines (in progress) | [118, 125-128] |
| INRA-Versailles lines | T-DNA with GUS | Ws | ~55,000 | 37,142 | [129, 130] |
| EXOTIC | Ds, Spm | Ler, Col | 23,573 | in progress | [35] |
| AGRIKOLA | RNAi | Col | 22,478 constructs, of which 2,519 are transformed into plants | | [131] |
| amiRNA (CSHL-based 2010 project) | amiRNA | | 17,699 clones | | see Table 2 |
| Seattle Arabidopsis TILLING Project | EMS | Col-0 erecta | 9,550 (in progress) | | see Table 2 |
| Gain-of-function | | | | | |
| FOX hunting lines | flcDNA over-expression | Col | 5,000 (in progress) from 12,000 expected | | [33] |
| Weigel lines | T-DNA AT | Col-7 | 16,398 in pools | | [29] |
| RIKEN activation-tagged lines | T-DNA AT | Col | 36,650 (in progress) | 1,172 (in progress) | [28, 40] |
| SK (Saskatoon) lines | T-DNA AT | Col-4 | ~50,000 (in progress) | 15,507 (in progress) | [39] |
| TAMARA | En/Spm AT | Col-0 | 9,471 lines | | [37] |
| CRES-T | chimeric repressors | | 395 (in progress) | | [132] |
| Chromatin charting | luciferase | Col-0 | 277 (in progress) | in progress | [42, 133] |

Collections with fewer than 1000 lines are not listed unless specifically referenced in the main text. Numbers of publicly available lines have been obtained from the corresponding
websites/references, the stock centers NASC, ABRC, RIKEN Bioresource center and http://signal.salk.edu/Source/AtTOME_Data_Source.html (as of September 2010). T-DNA AT, T-DNA
activation-tagged; En/Spm AT, En/Spm activation-tagged.

Table 2. Databases/Websites for Gene Function Analyses

| Resource/database | Content | Location |
|---|---|---|
| Stock Centers and general information | | |
| The Arabidopsis Information Resource (TAIR) | Information, tools, databases | http://www.arabidopsis.org/ |
| Arabidopsis Biological Resource Center (ABRC) | Information/stock centre | http://abrc.osu.edu/ |
| European Arabidopsis Stock Centre (NASC) | Stock centre | http://arabidopsis.info/ |
| Riken BioResource Center | Stock centre (Japan) | http://www.brc.riken.go.jp/lab/epd/Eng/species/arabidopsis.shtml |
| Arabidopsis thaliana Resource Centre for Genomics | Databases, stock centre | http://www-ijpb.versailles.inra.fr/en/sgap/equipes/variabilite/crg/ |
| Collection of lines and/or phenotypes | | |
| SALK T-DNA collection | Information/databases on SALK T-DNA lines | http://signal.salk.edu/tabout.html |
| GABI-KAT lines | Information/database for GABI-KAT lines | http://www.gabi-kat.de/ |
| Arabidopsis Genetrap Website at Cold Spring Harbor Lab | GUS patterns, phenotypes and mapping data of transposon insertions | http://genetrap.cshl.edu/ |
| RIKEN transposon- and activation-tagged lines | Information on lines and RIKEN Arabidopsis Phenome Information Database (RAPID) with phenotypes of 4000 lines | http://pfgweb.gsc.riken.go.jp/index.html; http://pfgweb.gsc.riken.go.jp/pjAct.html; http://www.brc.riken.jp/lab/epd/Eng/catalog/transp.shtml; http://rarge.gsc.riken.go.jp/dsmutant/index.pl; http://rarge.psc.riken.jp/phenome/ |
| FOX hunting lines | Project description | http://pfgweb.gsc.riken.go.jp/pjFox.html |
| INRA Versailles lines | Information on INRA lines, phenotypes in Agrobact+ database | http://www-ijpb.versailles.inra.fr/en/sgap/equipes/variabilite/crg/; http://dbsgap.versailles.inra.fr/agrobactplus/English/Accueil_eng.jsp |
| SK (Saskatoon) lines | Project description and sequence information | http://aafc-aac.usask.ca/FST/ |
| EXOTIC | Project description | http://www.jic.bbsrc.ac.uk/science/cdb/exotic/index.htm |
| Seattle Arabidopsis TILLING Project | Information on how to find point mutations in gene-of-interest | http://tilling.fhcrc.org/ |
| Arabidopsis Genomic RNAi Knock-out Line Analysis (AGRIKOLA) | RNAi lines | http://www.agrikola.org |
| amiRNA (CSHL-based 2010 project) | Project description and vector information | http://2010.cshl.edu/arabidopsis/2010/scripts/main2.pl; http://www.openbiosystems.com/RNAi/ArabidopsisthalianaamiRNA/ |
| FioreDB | Project description and database | http://www.cres-t.org/index_e.html |
| Bioassay and Phenotype Database (BAP DB) | Information on gene functions based on phenotype and screening data | http://bioweb.ucr.edu/bapdb |
| Specialized projects | | |
| SeedGenes Project | List of essential genes | http://www.seedgenes.org/ |
| Chloroplast 2010 project | Project description and database | http://www.plastid.msu.edu/ |
| Chloroplast Function Database | Project description and database | http://rarge.psc.riken.jp/chloroplast/ |
| GAL4-GFP Enhancer-trap lines | Project description, vector information and database for images | http://www.plantsci.cam.ac.uk/Haseloff/construction/catalogFrame.html |
| 1001 Genomes Project | Whole-genome sequence variation in 1001 accessions of A. thaliana | http://www.1001genomes.org/ |

2.2. Mutants with Reduced Gene Expression (Knock-Down)

The utility of loss-of-function lines is limited when muta- 
tions result in lethality, or in cases of genetic redundancy, 
when more than one copy of the gene exists as a result of 
tandem or segmental duplications or the evolution of 
multigene families. To overcome these drawbacks, trans- 
gene-mediated gene silencing can be used to reduce but not 
completely abolish ("knock-down") the expression of 
gene(s) of interest. Such "knock-down" lines allow one to
down-regulate several sequence-related genes simultaneously
and thus avoid the complications associated with genetic
redundancy. Because knock-down lines are not null
mutants and show a diverse pattern of phenotypes due to 
differences in the levels of activity remaining, the more sub- 
tle phenotypes induced by the partial or conditional inactiva- 
tion of essential genes can also be studied. 

Silencing is normally achieved by post-transcriptional 
down-regulation of transcript accumulation via small RNAs 
that act in a sequence-specific manner by base pairing to 
complementary mRNA molecules. Various strategies for 
small RNA-based gene silencing have been developed [20]. 
The first to be developed was the antisense approach, in 
which part of a gene was expressed in reverse orientation, 
usually under the control of the Cauliflower Mosaic Virus 
(CaMV) 35S promoter [21]. Later the RNA interference 
(RNAi) approach was introduced. This uses a binary hairpin 
RNA vector into which gene-specific tags (GSTs) are cloned 
[22]. Transformation of Arabidopsis with RNAi constructs is 
being systematically conducted by the AGRIKOLA (Arabi- 
dopsis Genomic RNAi Knock-Out Line Analysis) Consor- 
tium to create a collection of silenced Arabidopsis lines 
which are available through the Nottingham Arabidopsis 
Stock Centre (NASC) [23] (Tables 1 and 2). A newer ap- 
proach involves the use of artificial microRNAs (amiRNAs) 
[20, 24, 25]. The systematic design of amiRNA has led to the 
generation of 17,699 clones for use in Arabidopsis (as of 
June 2010), providing an important tool for future functional 
genomics studies (Table 1). The Chimeric Repressor Silenc- 
ing Technology (CRES-T) represents a novel method for 
creating loss-of-function mutations for the analysis of redun- 
dant plant transcription factors [26]. With CRES-T, a tran- 
scription activator of interest can be fused to a peptide or 
protein that converts it into a transcriptional repressor, which 
dominantly suppresses the expression of its target genes even 
in the presence of the redundant transcription factor. The 
phenotypic information obtained from the application of 
CRES-T in Arabidopsis is stored in the Web-based interface 
FioreDB [27] (Tables 1 and 2). 
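
Because small RNAs act by base pairing, one simple way to look for a sequence that could target several related genes at once is to search for stretches shared by all of their transcripts. The sketch below does this for exact matches only. The window length of 21 nucleotides, the function name `shared_kmers` and the toy sequences are assumptions made for illustration; real amiRNA design applies further rules that are not described here.

```python
def shared_kmers(sequences, k=21):
    """Return the set of length-k substrings present in every sequence."""
    kmer_sets = []
    for sequence in sequences:
        sequence = sequence.upper()
        kmers = {sequence[i:i + k] for i in range(len(sequence) - k + 1)}
        kmer_sets.append(kmers)
    if not kmer_sets:
        return set()
    return set.intersection(*kmer_sets)


if __name__ == "__main__":
    # Hypothetical transcript fragments from two members of a gene family;
    # a short window is used so that the toy example has shared stretches.
    family = ["AAACCCGGGTTT", "TTACCCGGGAAA"]
    print(sorted(shared_kmers(family, k=6)))
```

A stretch returned by this search is only a candidate: whether it silences all family members, and nothing else in the genome, has to be checked experimentally or against the complete transcriptome.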

2.3. Gain-of-Function Lines 

Gain-of-function lines can be used to dissect the func- 
tions of genes that are genetically or functionally redundant, 
i.e. which are members of gene families or are tandemly or 
segmentally duplicated, or whose function can be compen- 
sated for by alternative regulatory pathways [28]. The pheno- 
types caused by gain-of-function mutations segregate as 
dominant traits [8, 29] and individual members of a gene 
family can produce a gain-of-function mutant phenotype that 
cannot be modified by other members of the family [30, 31]. 
Often the mutant phenotypes induced by loss-of-function
and gain-of-function approaches are complementary to each 
other. Because of the dominant nature of the gain-of-function 
phenotypes, there is no need for an additional genotyping 
step to identify lines homozygous for the transgene. 

Overexpression of a gene driven by a strong promoter 
can result in gain-of-function phenotypes, especially when 
expression is also ectopic. This strategy requires the use of 
full-length cDNA clones. The RIKEN Arabidopsis full- 
length (RAFL) collection now makes large numbers of such 
cDNA clones available. The RAFL collection was generated 
by inserting full-length cDNAs in the correct orientation 
between the CaMV 35S promoter and the NOS terminator 
and verifying their sequences [32]. About 10,000 different 
RAFL cDNA clones were then pooled and transformed into 
Arabidopsis to generate so-called 'FOX' (Full-length cDNA 
Over-eXpressing gene) lines [33]. The FOX hunting system 
is useful for systematic phenotypic analysis of the function 
of each inserted gene. Visible phenotypes were induced in 
about 9% of transformed lines and details of these have been 
collected in a database (see Section 3.2 and Table 2). 

In the activation tagging approach, DNA sequences car- 
rying enhancers (e.g. four tandemly arranged copies of the 
CaMV 35S enhancer) are inserted randomly into the genome 
[34]. Large numbers of activation-tagging lines have been
produced, and phenotypes observed could be attributed to 
specific genetic events by sequencing the regions flanking 
insertions and mapping these to the reference genome [28, 
29, 34-39]. The distances between insertions and the genes 
affected were found to vary from 0.4 to 8.2 kb. Enhancers 
can control the expression of genes on both sides of the in- 
sertion and, indeed, in some cases more than one gene is 
affected [29, 40]. In addition to dominant gain-of-function 
phenotypes, knock-outs are also observed in these lines, 
when an insertion is located within a coding region. Between 
0.1 and 1% of the activation tagging lines showed visible
phenotypes [28, 29, 35-38]. Recently, a new activation- 
tagging method has been developed which uses pEnLOX 
lines containing four tandemly repeated CaMV 35S tran- 
scriptional enhancers flanked by two loxP sites, and pCre 
lines that contain the CRE gene [41]. By crossing pEnLOX
lines to pCre lines, the CaMV 35S enhancers can be deleted, 
causing reversion to the wild-type phenotype, and allowing 
for speedier confirmation of gene-function relationships. 
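
Since enhancers can act on genes on either side of an insertion, a first list of candidate genes for an activation-tagged phenotype can be drawn up from the distances alone. The sketch below collects all genes within 8.2 kb of an insertion, the upper end of the range quoted above; using that value as a cut-off is an assumption, as are the gene identifiers, coordinates and the function name `candidate_genes`.

```python
def candidate_genes(insertion, genes, max_distance=8200):
    """Return (gene_id, distance) pairs for genes within `max_distance`
    base pairs of the insertion site, nearest first.

    `genes` is a list of (gene_id, start, end) on the same chromosome.
    A distance of 0 means that the insertion lies inside the gene, in
    which case a knock-out rather than activation is to be expected.
    """
    hits = []
    for gene_id, start, end in genes:
        if start <= insertion <= end:
            distance = 0
        else:
            distance = min(abs(insertion - start), abs(insertion - end))
        if distance <= max_distance:
            hits.append((gene_id, distance))
    return sorted(hits, key=lambda hit: hit[1])


if __name__ == "__main__":
    # Hypothetical gene models flanking an insertion at position 8000
    genes = [("g1", 1000, 3000), ("g2", 12000, 14000), ("g3", 30000, 32000)]
    print(candidate_genes(8000, genes))
```

Each candidate then has to be tested for increased transcript levels, because proximity alone does not show which of the flanking genes is responsible for the phenotype.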

The Chromatin Charting Project has set itself the task of 
producing over 15,000 Ds-tagged lines of Arabidopsis, each 
of which contains within the Ds-tagging cassette a CaMV 
35S::luciferase gene (which allows monitoring of position 
effects) and a lac operator repeat (which allows the tagged 
loci to be visualized by expressing fusions of fluorescent 
proteins with LacI within the nuclei of living plants). The 
selected lines will be used to evaluate transgene position 
effects and how these are affected by developmental or ex- 
ternally applied cues; since position effects on gene expres- 
sion arise from alterations in chromatin architecture, this 
approach provides insights into global epigenetic control [42, 
43] (Table 1). 

3. SYSTEMATIC FORWARD GENETIC SCREENS 

Forward genetic screens have traditionally been used to 
identify genes involved in a specific biological process of 
interest. A range of mutant collections have been analysed,
including chemically or physically mutagenized populations 
- which have the advantage that it is, in principle, possible to 
attain saturation, i.e. to generate multiple mutant alleles for 
each gene - and tagged mutants, which allow the mutated 
gene to be identified in a relatively straightforward manner. 
Screens for a wide variety of phenotypes have been carried 
out in Arabidopsis, and only a selection can be discussed 
here. 

3.1. Generic Screens 

The RIKEN Activation Tagged Lines, comprising about 
50,000 independently transformed Arabidopsis lines [28, 40] 
(see Table 1), have been screened for visually discernible 
phenotypes. In all, 1262 lines exhibited visible phenotypes 
(e.g. abnormalities in morphology, growth rate, plant colora- 
tion, flowering time or fertility) and sequence information is 
available for 1172 mutant loci (Table 2). Of four dominant 
mutants that showed hyponastic leaves, downward-pointing 
flowers and decreased apical dominance, two were caused by 
activation of the gene ASYMMETRIC LEAVES2 (AS2) and 
two by activation of ASL1/LBD36 [28]. 

The FOX lines (see Section 2.3, Table 1) were also 
screened for abnormal morphologies, fertility and leaf col- 
oration. A total of 1487 candidate morphological mutants 
were found among 15,547 transformants, and of 115 pale 
green T1 mutants, 59 lines displayed the mutant phenotypes
in more than 50% of the T2 progeny, suggesting that the
phenotypes were caused by gain-of-function mutations (see
Section 2.3). Two leaf coloration mutants were further
characterized and one mutant was found to result from
up-regulation of AtPDH1 (Arabidopsis prokaryotic DEVH box
helicase 1); knock-out of this same gene resulted in an albino
phenotype [33]. 

3.2. Screens for Developmental Phenotypes 

The SeedGenes Project [44-47] was designed to identify
EMBRYO DEFECTIVE (EMB) genes required for embryo or 
seed development, and the December 2007 release of its da- 
tabase SeedGenes (Table 2) includes information on 358 
genes required for seed development and 605 mutant alleles 
with known disruptions in these genes. EMB genes encode 
proteins involved in basic cellular processes, such as DNA 
replication, RNA processing and protein synthesis, as well as 
chloroplast proteins, indicating that a functional chloroplast 
is needed for normal embryo development in Arabidopsis. 

Furthermore, over 500 seedling-lethal mutants were iden- 
tified in a set of 38,000 insertion mutants, of which 54 were 
molecularly characterized and 22 mutants were described in 
detail [48]. Many of the mutants displayed altered pigmenta- 
tion and affect genes encoding proteins predicted to reside in 
the chloroplast. 

To isolate mutants defective in early steps of meiotic 
recombination, a specialized two-step genetic screen em- 
ploying 55,000 T-DNA insertion lines has been carried out 
[49]. In the first step, all lines were screened for fertility de- 
fects on the basis of reduced silique elongation. In the sec- 
ond step, differential interference contrast microscopy was 
used to analyse male meiotic products in developing pollen 
mother cells. Four genes involved in the repair of meiotic
DNA double-strand breaks were identified, together with
five genes necessary for formation of meiotic DNA
double-strand breaks [49].

3.3. Screens for Genes Important for Male or Female 
Gametophyte Development and Reproduction 

The life cycle of plants alternates between a haploid 
gametophyte and a diploid sporophyte phase. Mutations that 
eliminate or disrupt the function of the haploid pollen (male 
gametophyte) or of the embryo sac (female gametophyte) 
can only be maintained in the heterozygous state, because 
such mutations cannot be transmitted through the defective 
gamete (reviewed in [50-52]). In addition, mutations in 
sporophytically expressed genes might result in male or fe- 
male sterility [53, 54]. 

Visual inspection of siliques for 50% or more desiccated 
ovules, as well as screens for aberrant transmission of an 
antibiotic resistance marker gene, led to the identification of 
gametophytic mutants [55-61]. In Arabidopsis, due to the 
widely used "floral dipping" transformation protocol, the
primary target of T-DNA integration is the embryo sac.
Therefore, it can be difficult to recover T-DNA insertional
knock-out lines in which female gametophyte function is
severely compromised [56]. To overcome this problem, 
transposon insertion lines have been used. Ds transposable 
elements are mobilized during the sporophytic stage of a
plant's life cycle, and most Ds transposon insertion lines
possess further advantages, such as single-copy insertions without
DNA rearrangements or truncations [62]. A total of 24,000
Ds insertion lines containing either a Ds gene trap or an en- 
hancer trap element were screened and 130 Arabidopsis mu- 
tants with defects in female gametophyte development and 
function were identified [63]. The inheritance of the Ds ele- 
ment could be followed by the linked antibiotic resistance 
marker. Given the dominant mode of inheritance of the resis- 
tance phenotype, a segregation of 3:1 is expected in the 
progeny of a heterozygous Ds insertion line. Aberrant segre- 
gation ratios lower than 2:1 are often the result of a gameto- 
phytic defect. For 1.38% of the screened lines a segregation 
ratio of 1.5:1 or less was found and a wide variety of mutant 
phenotypes were observed, such as defects in different stages 
of embryo sac development and in processes such as pollen 
tube guidance, fertilization or early embryo development. 
Besides female gametophytic mutants, 109 lines showing 
predominant defects in male gametophyte transmission were 
also identified in this collection [64]. 
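
The segregation logic described above can be sketched as follows, under simplifying assumptions that are not taken from the cited studies: a single insertion, a fully dominant resistance marker, and transmission efficiencies expressed relative to the wild-type allele (1.0 for normal transmission, 0.0 for none). The function names are hypothetical.

```python
def expected_ratio(female_transmission=1.0, male_transmission=1.0):
    """Expected resistant:sensitive ratio in the selfed progeny of a
    heterozygous insertion line."""
    # Fraction of female and male gametes that carry the insertion
    p_female = female_transmission / (1 + female_transmission)
    p_male = male_transmission / (1 + male_transmission)
    # Sensitive progeny inherit the insertion from neither parent
    sensitive = (1 - p_female) * (1 - p_male)
    return (1 - sensitive) / sensitive


def chi_square_3_to_1(resistant, sensitive):
    """Chi-square statistic (1 degree of freedom) for a 3:1 expectation."""
    total = resistant + sensitive
    expected_resistant = 0.75 * total
    expected_sensitive = 0.25 * total
    return ((resistant - expected_resistant) ** 2 / expected_resistant
            + (sensitive - expected_sensitive) ** 2 / expected_sensitive)


if __name__ == "__main__":
    print(expected_ratio())                        # normal transmission
    print(expected_ratio(female_transmission=0))   # no female transmission
    print(chi_square_3_to_1(60, 40))               # compare with 3.84 (P = 0.05)
```

With normal transmission through both gametophytes the expected ratio is 3:1; if the insertion cannot be transmitted through one of the two gametophytes, the ratio falls to 1:1, which is why ratios well below 3:1 point to a gametophytic defect.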

3.4. Screens for Altered Metabolism or Stress Tolerance 

In a screen of the TAMARA gain-of-function collection 
[37] for mutants with enhanced levels of phenolic com- 
pounds, up-regulation of an R2R3-MYB transcription factor 
was found to be responsible for the phenotype [28, 37, 65]. 
Genes controlling proanthocyanidin biosynthesis have also 
been identified in both the TAMARA and Saskatoon collec- 
tions (Table 2) [39]. 

A small subset of the FOX lines has also been assessed 
for mutants affecting stress-inducible transcription factors. 
Among 43 cDNAs tested in a transgenic plant library, the 
authors identified salt-tolerant lines, all of which
overexpressed the same bZIP-type transcription factor, AtbZIP60
[66]. 

The COS (controlled cDNA overexpression) system is 
based on a cDNA expression library that is driven by an in- 
ducible promoter. Sets of 20,000 to 40,000 transgenic 
seeds/seedlings were screened in three different ways, 
namely for ABA insensitivity, salt tolerance and for activa- 
tion of a stress-responsive alcohol dehydrogenase-luciferase 
reporter system [67, 68]. Twenty-seven cDNAs conferring 
dominant, inducible stress-tolerance phenotypes were identi- 
fied, and recloning of several of these cDNAs confirmed the 
observed phenotypes [67].

3.5. Chemical Genetic Screens 

Treating mutant collections with a chemical compound to
elucidate gene functions and signal transduction pathways is
a widely applied strategy [69]. The approach can be limited by
poor uptake of the compound by the plant or poor transport
across membranes, and by modifications such as acetylation
that might inactivate the compound. A prerequisite for these
so-called "chemical genetic screens" is the availability of a
library of chemicals. The chemicals are usually used to screen
for changes in a specific physiological parameter. For
strigolactones, for example, which play a role in the seed
germination of parasitic plants such as Striga, a collection of
related small molecules - cotylimides - that act as genetic
suppressors of strigolactone levels was identified in
this way [70]. These cotylimides were used in a suppressor
screen with T-DNA mutants, and light-signaling genes were 
identified as positive regulators of strigolactone levels [70]. 
Another example is the compound pyrabactin. Pyrabactin 
has been isolated as a synthetic seed germination inhibitor 
that mimics abscisic acid (ABA) in a highly selective way, 
thereby activating the ABA pathway. Arabidopsis EMS mu- 
tants were screened with pyrabactin and a new component of 
the ABA pathway, the PYRABACTIN RESISTANCE 1 
(PYR1) gene, was identified. It codes for a START protein,
which was characterized as the long-elusive ABA receptor 
[71]. 

Besides EMS or insertion mutants, Arabidopsis natural
accessions are a rich resource for chemical genetic screens.
When Arabidopsis natural accessions were screened against a
library of more than 10,000 compounds, twelve
accession-specific molecules were identified [72]. The natural
resistance to hypostatin, an inhibitor of cell expansion, was
identified in this way and attributed to HYR1, a UDP
glycosyltransferase.

Chemicals that inhibit a specific protein function produce
a phenotype similar to that of the corresponding
loss-of-function mutants. Conversely, plants treated with a
compound that activates a protein function resemble
gain-of-function mutants. Varying the concentration of the
compound can mimic an allelic series of mutations. If a
chemical interferes with the function of all members of a
protein family, it can substitute for higher-order mutants.
The ability to perturb protein functions only in the presence
of a chemical, and therefore for a limited time, allows the
study of essential genes that would display a lethal phenotype
if permanently mutated. Nevertheless, only a limited number of
chemical compounds with a defined function are currently
available for addressing biological questions. Databases such
as ChemMine [73] or PubChem
(http://pubchem.ncbi.nlm.nih.gov/) are helpful in this regard.

3.6. Non-Invasive Screens 

Ideally, high-throughput screens should not damage the 
plants investigated so that they can be used directly for the 
next steps in forward genetics. Besides visual screens for 
altered growth or coloration phenotypes, other non-invasive 
screening procedures have been employed. For instance, 
systematic measurements of chlorophyll fluorescence have 
been used to determine the efficiency of photosynthetic elec- 
tron flow. To this end, image analysis [74, 75] and a semi- 
automated pulse-amplitude modulated fluorometer device 
[76] have been developed. Chlorophyll fluorometry has also 
been used for the quantitative assessment of drought survival 
[77]. The luciferase and green fluorescent protein (GFP; and 
its derivatives) reporter genes allow one to monitor changes 
in the expression of genes by measuring luminescence and 
fluorescence, respectively, and these have been utilized to 
identify mutants with deregulated gene expression or altered 
subcellular localisation of gene products (e.g. [78, 79]). The
luciferase screen in particular, with its very short response
time, can be employed to test for transient inductions. This
system has been applied in stress or ABA-related mutant 
screens [80, 81], as well as in circadian rhythm related mu- 
tant screens [82-84]. 

Some of the lines generated within the SAIL project were 
transformed with a vector that carries the LAT52:GUS re- 
porter gene, a visible cell-autonomous marker for pollen 
grains and tubes [44]. This population was generated in the
qrt1-2 mutant background - a mutant that maintains the four
male meiotic products in a tetrad [85] and is very useful for
non-invasive screens during male gametophyte development.
Mutant lines can be classified as either homozygous or
hemizygous for their T-DNA insertion by staining their pollen
grains. In this way, the hapless (hap) mutations were
isolated, which impair the development or function of haploid
gametophytes [86].

The collection of GAL4 enhancer trap lines harbors a 
GAL4-responsive GFP gene as a marker to tag specific cell 
types and to reveal developmental transitions [87]. The lines
can be screened microscopically or even viewed as images
(http://www.plantsci.cam.ac.uk/Haseloff/geneControl/catalogFrame.html)
and are very useful to identify enhancers for a
specific cell type. The screen has been successfully applied 
and resulted in the isolation of Dof elements, which are in- 
volved in the regulation of guard cell gene expression [88]. 
In a further step some well characterized lines with specific 
expression patterns can be used as starter lines for EMS 
mutagenesis. This approach was applied to dissect the 
mechanisms that specify the different populations of pericy- 
cle cells. Thus, a pericycle-specific enhancer trap line was 
mutagenized and several mutants exhibiting qualitative or 
quantitative alteration of the GFP expression pattern were 
isolated [89]. 

4. SYSTEMATIC REVERSE GENETICS SCREENS 

Efficient systematic reverse genetics requires the avail- 
ability of gene-indexed mutant populations, and the use of 
robust, high-throughput assays for phenotypic
characterizations. Studies of this type in Arabidopsis have benefitted
from previous experience with the model organisms 
Caenorhabditis elegans [90] and Saccharomyces cerevisiae 
[91]. In particular, T-DNA mutant collections have been 
widely used for reverse genetics, as indicated by the more 
than 2000 cumulative citations in the literature [9]. The ma- 
jority of genes in the Arabidopsis genome are members of 
gene families [1], so that redundant gene functions often 
have to be assessed by generating double or higher-order 
mutants from T-DNA lines. 

In the course of systematic reverse genetic screens, lines 
from different collections (see Table 1) are usually com- 
bined, in order to achieve maximum coverage of the target 
genes. In addition, for genes that are not covered by existing 
mutant collections, partial loss-of-function lines are often 
generated by the knock-down approach (see Section 2.2). 

4.1. Small- and Medium-Scale Screens 

Numerous small- and medium-scale reverse genetic 
screens have been performed which have addressed specific 
biological functions or multigene families. Only a small
selection can be discussed here (Table 3). One of the first
systematic screens with insertion-tagged populations was
performed more than 20 years ago. A collection of 8,000
Arabidopsis plants carrying 48,000 insertions of the maize
transposable element En-1 was screened for knock-out alleles of
genes involved in flavonoid biosynthesis, utilizing a PCR- 
based screening protocol [92]. Another early example in- 
volved the systematic analysis of nuclear genes for photosys- 
tem I subunits, and used T-DNA or transposon insertion 
lines, together with knock-down lines [93, 94]. 

For reverse genetic analysis of the nine members of the 
1-aminocyclopropane-1-carboxylate synthase (ACS) gene
family, seven T-DNA mutants and two amiRNA lines were
employed [95] (Table 3). A larger screen was conducted for
the subtilisin-like serine proteases (subtilases), a family
comprising 56 members in Arabidopsis. Here 144 T-DNA insertion lines
for 55 subtilase genes were identified. With the exception of 
SDD1, none of the lines displayed an obvious visible pheno- 
type in the homozygous state [96]. This result highlights the 
need to use robust, high-throughput phenotypic screening 
methods to assign a phenotype, i.e. a biological function, to 
each gene, even in medium-scale screens. 

Table 3. Selected Reverse Genetics Screens in Arabidopsis

Gene family / biological process                                                    Number of screened genes  Reference

Small-scale screens (fewer than 20 genes)
NADP-malic enzyme / respiration                                                     4                         [134]
Sucrose synthase / carbon metabolism                                                6                         [135]
Type A response regulator / cytokinin signaling                                     6 of 10                   [136]
Aminocyclopropane-1-carboxylate synthase (ACS) gene family / ethylene biosynthesis  9                         [95]
Photosystem I subunit (nuclear encoded) / photosynthesis                            11                        [93, 94]
Nucleobase-ascorbate transporter / unknown biological function                      12                        [137]
Laccase / glycoprotein                                                              12 of 17                  [138]
Auxin response factor / auxin-mediated transcriptional activation/repression        18 of 23                  [139]

Medium-scale screens (20 to 100 genes)
Chloroplast protein kinases and phosphatases / reversible protein phosphorylation   22                        [140-142]
Pentatricopeptide repeat proteins / organelle biogenesis                            39 of 450                 [143]
Subtilases / serine proteases                                                       55                        [96]
Pollen tube growth                                                                  33                        [102]
P450 cytochrome oxidases                                                            91                        [104]

Large-scale screens (more than 100 genes)
Chloroplast function                                                                2733                      [105]
Chloroplast function                                                                1369                      [107]

In some cases, where genes with a known overall function
were characterised by reverse genetics, an in-depth
description could be provided for every member of the set.
This was achieved, for instance, for all genes coding for
photosystem I subunits (see above; reviewed in: [93]). The
opposite case is represented by the so-called
"guilt-by-association" approach, which emerged recently and
is based on the concept that genes with similar functions
often behave
similarly with respect to their associated transcript, protein 
or metabolite profiles as measured by -omics technologies. 
Co-expression analysis combines gene expression measure- 
ments from a large number of experiments thereby providing 
improved statistics. Thus, co-expression analysis can be used 
to predict the function of genes based on the similarity of 
their expression patterns [97, 98]. 
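
The ranking step behind such a co-expression prediction can be sketched in a few lines. The example below is only an illustration and does not reproduce the pipelines of the cited studies: it assumes that similarity is scored with the Pearson correlation coefficient, and the gene names and expression values are invented. Candidate genes are ranked by how closely their profiles follow that of a "bait" gene of known function.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient between two equal-length profiles.

    Assumes both profiles vary across experiments (non-zero variance).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)

def rank_by_coexpression(bait, candidates):
    """Rank candidate genes by correlation with the bait profile, highest first."""
    scores = {gene: pearson(bait, profile) for gene, profile in candidates.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)

# Hypothetical expression values across five experiments (invented for illustration).
bait_profile = [1.0, 2.0, 3.0, 4.0, 5.0]      # stands in for a known photosynthesis gene
candidate_profiles = {
    "geneA": [2.1, 3.9, 6.2, 8.0, 9.8],        # rises together with the bait
    "geneB": [5.0, 4.0, 3.0, 2.0, 1.0],        # falls as the bait rises
    "geneC": [3.0, 1.0, 4.0, 1.0, 5.0],        # only weakly related
}

ranking = rank_by_coexpression(bait_profile, candidate_profiles)
for gene, r in ranking:
    print(f"{gene}: r = {r:.2f}")
```

In practice the correlation would be computed over hundreds of experiments, which is what provides the improved statistics mentioned above; genes at the top of the ranking are the candidates for reverse genetic analysis.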

For instance, it was shown that nuclear genes for chloro- 
plast proteins involved in photosynthesis or plastid gene ex- 
pression indeed exhibit a high level of co-expression at the 
transcript level [99]. This finding was then exploited to sys- 
tematically characterize by reverse genetics genes of un- 
known function that exhibited photosynthesis gene-like tran- 
scriptional profiles, and led to the identification of PGRL1, a 
central component of cyclic electron flow around photosys- 
tem I (PSI) [100]. A gene for a plastidic bile-acid transporter 
was also isolated based on a co-expression approach [101]. 
Moreover, a set of genes, which are specifically expressed in 
pollen tubes in response to their growth in the pistil but not 
expressed during other stages of pollen or plant develop- 
ment, was identified. For 33 pollen tube-expressed genes the 
respective mutants were analyzed in a reverse genetics ap- 
proach. The mutants were derived from a subset of the SAIL 
collection that was generated in the quartet (qrt) background 
containing a LAT52 promoter driven GUS reporter gene 
[44]. Indeed, seven out of 33 investigated genes were critical 
for pollen tube growth [102]. Other examples are reviewed 
in [103]. However, the "guilt-by-association" approach often 
involves screening large numbers of genes, and very few 
mutants exhibit the "desired" phenotype. 

An example of chemical genetic screening in combina- 
tion with reverse genetics involves Arabidopsis lines over- 
expressing 91 P450 cytochrome oxidases. In the course of 
this screen two P450 cytochrome oxidases that hydroxylate 
the therapeutic compound 8-methoxypsoralen were identified
[104]. This demonstrates the potential of chemical genetic
screening in combination with reverse genetics to assign
specific functions to individual members of a large protein
family.

4.2. Large-Scale Screens 

In the course of the Chloroplast 2010 Project mutations 
in nuclear genes that were computationally predicted to en- 
code chloroplast-targeted proteins were subjected to a di- 
verse set of phenotypic screens [105, 106]. During this study 
a total of 2733 mutants were found to be homozygous for the 
inspected T-DNA insertion, and 85 phenotypic characters 
were evaluated in each. The phenotypic tests involved quan- 
titative measurements of metabolites, such as fatty acid 
methylesters in leaves or amino acid profiles in seeds, and 
quantification of photosynthetic parameters by a chlorophyll 
fluorescence assay [106]. Data are available on the project's 
website (see Table 2). Among the confirmed mutants obtained
by the Chloroplast 2010 Project are mutants for At1g10310
(previously described as a pterin aldehyde reductase involved
in folate metabolism) that were found to have an altered
fatty acid desaturation phenotype. Single insertions in the
gene Acyl Carrier Protein4 (ACP4) were responsible for
defects in growth, fatty acid profiles and chlorophyll
fluorescence, previously associated with the knock-down of
the Acyl Carrier Protein gene family. However, the case
studies of the Chloroplast 2010 Project also stressed the
difficulty of securely establishing genotype-phenotype
relationships. It became clear that, as in traditional
reverse genetic screens, it is essential to confirm the
genotype in advance and to exclude the possibility that a
secondary mutation causes the phenotype. A second,
independent collection of 3246 Ds/Spm- or T-DNA-tagged
homozygous lines for 1369 nuclear-encoded chloroplast
proteins has been initiated [107]. Phenotypes were monitored
in 3-week-old seedlings grown on agar plates, and images are
presented on the Chloroplast Function Database
(http://rarge.psc.riken.jp/chloroplast/).

5. NEW TRENDS 

5.1. The Unimutant Collection - not Quite the Ultimate 
Tool? 

The isolation of two homozygous insert lines for every 
gene in the Arabidopsis genome would make it possible to 
systematically test the phenotypic consequences of gene loss 
under a wide variety of conditions by forward genetics [19]. 
The SALK unimutant collection includes 18,318 genes, but 
so far only about 9000 of these are covered by two alleles. 
Once the unimutant collection is complete, it can be screened 
on several large-scale platforms for high-throughput assays 
(e.g. [108-111]). The major advantage of screens based on the
unimutant collection is obvious: exhaustive screens can be 
carried out with high efficiency, because loss-of-function 
lines for all Arabidopsis genes can be tested relatively 
quickly. But loss-of-function lines have their limitations.
Firstly, they provide only the first step in the analysis of
essential genes (see Section 2.2). Secondly, the function of 
many genes will be very difficult to assess because of their 
apparent phenotypic silence in many assays. Thirdly, genetic 
redundancy will continue to hamper attempts to assign func- 
tions to members of gene families (see Section 5.2). 

5.2. From the Unimutant to the Uni-Multimutant Collec- 
tion: A Closer Approach to the Ultimate Tool 

How can the problem of genetic redundancy be solved 
for thousands of Arabidopsis genes? In the simplest case, 
when two unlinked genes code for the same protein, the sys- 
tematic generation of double mutants will make such genes 
manageable. In fact, in the course of the German Plant Ge- 
nome Initiative (GABI), one ongoing project is dedicated to 
the systematic generation of double mutants for segmentally duplicated genes.
Segmental duplications account for the duplication of around 
2900 genes in the Arabidopsis genome and can encompass 
chromosomal segments with up to about 300 genes which 
have been duplicated as blocks [112, 113]. In the GABI- 
DUPLO project, corresponding double mutants are being 
generated for 750 such gene pairs by crossing of T-DNA- 
based single mutants, selfing and PCR-based genotyping, 
representing a valuable complement to the unimutant collec- 
tion (https://www.gabi-kat.de/duplo.html). 
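
The genotyping effort implied by this crossing scheme can be estimated with a short calculation. The sketch below is not taken from the GABI-DUPLO protocol; it assumes simple Mendelian segregation of two unlinked insertions in the F2 generation, full viability of the double mutant, and a 95% confidence threshold chosen here purely for illustration.

```python
from math import ceil, log

def double_homozygote_fraction():
    """Expected F2 fraction homozygous for both insertions (unlinked loci assumed)."""
    return 0.25 * 0.25

def plants_to_genotype(confidence, fraction):
    """Smallest n such that P(at least one double homozygote among n plants) >= confidence."""
    return ceil(log(1.0 - confidence) / log(1.0 - fraction))

fraction = double_homozygote_fraction()
n_95 = plants_to_genotype(0.95, fraction)  # 0.95 is an illustrative threshold

print(f"Expected double-homozygous fraction in the F2: {fraction}")
print(f"F2 plants to genotype for 95% confidence: {n_95}")
```

Under these assumptions only one F2 plant in sixteen carries both insertions in the homozygous state, which is why PCR-based genotyping of several dozen plants per gene pair is required. Linkage between the two loci, or reduced viability of the double mutant, would change these numbers.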

Closely linked gene duplications, including tandem du- 
plications as well as gene families containing more than two 
closely related genes, can only be handled by gene silencing 
approaches, and here the AGRIKOLA collection or amiRNA 
lines (see Section 2.2) promise to meet this need.

5.3. Targeted Mutagenesis

Site-directed mutagenesis remains a significant technical 
challenge in higher plants. Many attempts have been made to 
establish targeted mutagenesis in Arabidopsis, but no routine 
approach is available yet. Recently, zinc-finger nucleases 
(ZFNs) were used for targeted gene inactivation in Arabi- 
dopsis [114]. In this study it was shown that ZFNs engi- 
neered to act on two different genes and driven by a heat- 
shock promoter induced mutations in their target genes at 
frequencies of 3% and 2.6%, respectively. 

The use of transcription activator-like effectors (TALEs) 
represents another approach with great potential. TALEs are 
injected into plant cells via the type III secretion pathway 
found in many plant pathogenic Xanthomonas spp. Once 
inside, they may contribute to disease or trigger resistance by 
binding to DNA and turning on TALE-specific host genes. 
Recent advances in understanding of the mechanistic basis of 
specific DNA binding by TALEs [115] should permit engi- 
neering of TALEs for arbitrary gene activation or as transla- 
tional fusions for directing other proteins, such as negative 
regulators, methylases or nucleases, to specific DNA targets 
(reviewed in [116]). 

5.4. Cataloguing of Mutant Phenotypes 

Because very many research groups generate data from 
forward and reverse genetics experiments, a central deposi- 
tory for documentation of mutant phenotypes is highly desir- 
able. To this end, The Arabidopsis Information Resource 
(TAIR) recently began to assign a phenotype to each mutant 
allele and made pictorial records of the mutant phenotypes 
available online (Table 2; [117]). Curators at TAIR have 
been capturing phenotypic information from the Arabidopsis 
literature since 2005. As of July 2009 the TAIR database 
contained 7352 distinct free-text phenotypes associated with 
11,381 distinct genotypes derived from almost 1500 publica- 
tions. 

Phenotypic databases already exist for individual mutant 
collections. These include, for instance, the Riken Arabidop- 
sis Phenome Information Database (RAPID) (Table 2), 
which is a searchable database describing morphological 
phenotypes of ~4,000 Ds transposon mutant lines [118].
Phenotypic descriptions of these lines have been classified 
into eight primary categories (such as seedlings, leaves, 
stems, flowers and siliques) and fifty secondary categories. 
Images of individual plants are also available and can be 
searched by the line number or the phenotypic categories. 
Similarly, the Bioassay and Phenotype Database (BAP DB) 
(Table 2) is a database for exploring gene functions based on 
available phenotypes and for screening data from mutant, 
transgenic and wild-type organisms. It currently contains 
some project-specific phenotypic data for mutants under 
various abiotic stresses. BAP DB also allows users to upload 
their own assay and phenotypic data. 

5.5. The Future of Systematic Phenotypic Screens 

Assuming that one needs to screen a minimum of two 
homozygous individuals for two different mutant alleles for 
each of the approximately 27,000 genes in Arabidopsis, a 
total of 108,000 individual plants would have to be planted 
to analyse the completed unimutant collection (see Section
5.1). When grown in standard trays, a total greenhouse area
of 420 m² would be needed - a requirement which becomes
feasible if plants are screened consecutively and not in
parallel. A space-saving alternative is to perform screens
on seedlings, using either media plates or liquid assays in
media-containing multi-well plates, to which different
compounds that generate abiotic or biotic stresses can easily
be added. Moreover, because the unimutant collection would
need to be complemented with double mutants for segmentally
duplicated genes (like the GABI-DUPLO lines) and RNAi lines
for very similar members of multigene families (like
AGRIKOLA or amiRNA lines), the numbers listed above might
have to be increased by 25% to screen a uni-multimutant
collection. Exhaustive characterisation of such populations
can only be achieved by collaborations involving many
laboratories, each covering a specific set of phenotypic
screens according to its specific expertise. Such screens
would in many cases be repetitions of already established
phenotypic assays but, because of the large
number of plants to be considered, automated, innovative 
and non-invasive screens will have to be developed. Such 
screens will combine the capture of 'classical' visible pheno- 
types (such as growth rate, plant and root architecture and 
flowering time), specialized non-invasive assays (such as the 
ones to measure photosynthesis) and 'chemical' phenotyping 
(such as the profiling of metabolites, proteins and
transcripts). In addition, the response of plants to different
environmental conditions will have to be assessed. Finally,
data collection and processing have to be streamlined to allow
efficient and reliable assignment of phenotypes to individual
plants. This will have to include the capability of following 
and documenting the fate (and phenotype) of each individual 
plant from sowing to disposal, preferably in an automated 
way. New resources for the computational analysis of im- 
ages and data sets are important. In addition, standardized 
storage of data that allows systematic interpretation of the 
phenotyping results is necessary. The use of ontology terms 
to describe the phenotypes - like the use of Gene Ontology to 
describe biological processes, molecular functions and cellu- 
lar components - will have to be employed. 
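
The plant numbers quoted at the beginning of this section follow from simple arithmetic, which the sketch below makes explicit. All input values are taken from the text; the planting density is not stated in the text and is merely derived here from the quoted plant number and greenhouse area. Applying the 25% increase to the area as well is likewise an assumption.

```python
GENES = 27_000               # approximate number of Arabidopsis genes (from the text)
ALLELES_PER_GENE = 2         # two independent mutant alleles per gene
PLANTS_PER_ALLELE = 2        # two homozygous individuals per allele
GREENHOUSE_AREA_M2 = 420     # area quoted for standard trays
MULTIMUTANT_INCREASE = 0.25  # extra lines for the uni-multimutant collection

unimutant_plants = GENES * ALLELES_PER_GENE * PLANTS_PER_ALLELE
multimutant_plants = round(unimutant_plants * (1 + MULTIMUTANT_INCREASE))

# Derived values (not stated in the text).
plants_per_m2 = unimutant_plants / GREENHOUSE_AREA_M2
multimutant_area_m2 = GREENHOUSE_AREA_M2 * (1 + MULTIMUTANT_INCREASE)

print(f"Unimutant collection: {unimutant_plants} plants")
print(f"Uni-multimutant collection: {multimutant_plants} plants")
print(f"Implied density: {plants_per_m2:.0f} plants per square metre")
print(f"Area for the uni-multimutant collection: {multimutant_area_m2:.0f} square metres")
```

The calculation gives 108,000 plants for the unimutant collection and 135,000 plants for the uni-multimutant collection, for a single growth condition and a single round of screening.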

In spite of these many challenges, the prediction that ge- 
nome-wide phenotypic data will soon become available for 
Arabidopsis does not appear to be all that far-fetched. Its 
realisation will no doubt stimulate a wealth of in-depth gene- 
function analyses. 